Remarks

Reconsideration of the application is respectfully requested in view of the 
foregoing amendments and following remarks. With entry of amendments included 
herein, claims 1-2 and 4-37 are pending in this application. Claims 1, 9, 23, 31, 36 and 
37 are independent. No claims have been allowed. Claims 1-2, 23-24, and 27 have been 
amended and claim 3 has been canceled. 

Independent claims 9, 31, 36 and 37 have not been amended. Accordingly, "a 
second or any subsequent action on the merits in any application or patent undergoing 
reexamination proceedings will not be made final if it includes a rejection, on newly cited 
art, other than information submitted in an information disclosure statement filed under 
37 CFR 1.97(c) with the fee set forth in 37 CFR 1.17(p), of any claim not amended by
applicant or patent owner in spite of the fact that other claims may have been amended to 
require newly cited art." See, MPEP § 706.07(a). 

Request for Interview 
If any issues remain, the Examiner is formally requested to contact the 
undersigned attorney prior to issuance of the next Office Action in order to arrange a 
telephonic interview. It is believed that a brief discussion of the merits of the present 
application may expedite prosecution. Applicants submit the foregoing formal 
Amendment so that the Examiner may fully evaluate Applicants' position, thereby 
enabling the interview to be more focused. 

This request is being submitted under MPEP § 713.01, which indicates that 
an interview may be arranged in advance by a written request. 

Patentability of Claims 17, 20 and 23-30 over Jain in view of Lee under 103(a) 

Claims 17, 20, and 23-30 are rejected under 35 U.S.C. 103(a) as being 
unpatentable over USP No. 5,729,471 to Jain et al. ("Jain") in view of USP No.
5,612,743 to Lee ("Lee"). Applicants respectfully submit that the claims in their present
form are allowable over the applied art. To establish a prima facie case of obviousness,
three basic criteria must be met. First, there must be some suggestion or motivation, 
either in the references themselves or in the knowledge generally available to one of 
ordinary skill in the art, to modify the reference or to combine reference teachings. 
Second, there must be a reasonable expectation of success. Finally, the prior art 
reference (or references when combined) must teach or suggest all the claim limitations. 
(MPEP § 2142).

Independent Claim 23 

Amended claim 23 recites as follows: 

A method of recovering a three-dimensional scene from a sequence of two- 
dimensional frames, comprising: 

(a) identifying at least a first base frame in a sequence of two- 
dimensional frames; 

(b) adding the at least first base frame to create a first segment of 
the sequence; 

(c) identifying feature points in at least the first base frame in the 
first segment; 

(d) analyzing a next frame in the sequence to identify the feature 
points in the next frame; 

(e) determining whether a threshold number of feature points 
from the base frame are identified in the next frame; 

(f) if a threshold number of feature points are identified in the 
next frame, adding the next frame to the first segment; and 

(g) repeating (d) through (f) for subsequent frames until the 
number of feature points in a frame falls below the threshold number. 

The applied references, Jain and Lee, both individually and in combination, fail to 
teach or suggest many aspects of claim 23. For instance, combining the method of 
capturing video at a standard 30 frames per second rate as taught by Jain with the motion 
estimation technique taught by Lee (based on pixel comparison between frames) fails to 
teach or suggest "a method of recovering a three-dimensional scene from a sequence of 
two-dimensional frames, comprising ... determining whether a threshold number of
feature points from the base frame are identified in the next frame; if a threshold number
of feature points are identified in the next frame, adding the next frame to the first
segment."

By contrast, the Applicants' specification at Pg. 9, Ln. 4 - Pg. 10, Ln. 18 recites an
exemplary method of "determining whether a threshold number of feature points from
the base frame are identified in the next frame; if a threshold number of feature points
are identified in the next frame, adding the next frame to the first segment" as follows:

Segmenting the Input Sequence 

FIG. 4 is a flow chart of a method for dividing the input sequence of images into 
segments. In a first box 90, a first frame in a segment is identified. For example, frame 
62 (FIG. 3) is a first frame in the input sequence of images and is selected as the first 
frame for the first segment. Typically, the first frame in the segment is chosen as a base 
frame. 

In box 92, feature points are identified for the base frame or frames in the 
segment. There are several techniques for identifying feature points. One technique is to
scan through pixels in the image and identify any pixels that are corners of the image.
The corners are then designated as feature points. Where the image corners are feature 
points, only one frame needs to be used as a base frame. For example, frame 62 could be 
used as the base frame. 

Another technique that may be used to identify feature points is motion 
estimation. With motion estimation, two or more frames are used to identify feature 
points. A potential feature point is identified in a first frame, such as frame 62 (FIG. 3),
and then a next frame, such as frame 63 is analyzed to see if the feature point from frame 
62 can be tracked to frame 63. Motion estimation is further described below in relation 
to FIG. 6. Whether using motion estimation or identifying corners, the feature points in 
one or more base frames for the segment are identified in box 92. 

In box 94, a next frame in the segment is obtained to compare with the base frame
or frames. For example, frame 64 may be the next frame. In box 96, feature points are
identified in the frame 64 and are compared to the feature points in the base frame or 
frames. In decision 98, a determination is made whether a threshold number of feature 
points are tracked to the frame being analyzed (e.g., frame 64) from the base frames. The 
threshold number of points may be 60%, but other desired thresholds can be used. If 
frame 64 contains more than the threshold number of feature points then the frame is 
added to this segment (box 100). In the present example, frame 64 is added to segment 1. 
The next frame (not shown) in the input sequence of images is then used to determine 
whether it should be added to the segment. As indicated by arrow 102, each frame in an 
input sequence is taken in order and analyzed to determine whether it contains a threshold 
number of feature points tracked from the base frame or frames. If so, the frame is added 
to the current segment. At some point, however, a frame will not contain the threshold 
number of feature points and the decision made in decision 98 will be negative. At that 
point, a decision is made whether this is the final segment in the input sequence of
images (decision 104). If it is the final segment (segment N), then the segmenting is 
complete (box 106). If however, there are more images in the input sequence, then the 
current segment is ended and the next segment is started (box 108). Typically, the last 
frame in the previous segment is also used as a base frame in the next segment. It also 
may be desirable to overlap several frames in each segment. Once the base frame is 
identified for the next segment, arrow 110 indicates that the process starts over for this 
next segment. Again, feature points are tracked with respect to the new base frame in the 
next segment. Because the number of frames depends on the feature points tracked 
between the frames, the segments can vary in length. (Emphasis added).
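
For illustration only, the segmenting flow quoted above can be sketched in Python. This is a minimal sketch under stated assumptions, not the implementation described in the specification. The function name segment_sequence and the representation of each frame as a set of feature-point identifiers are hypothetical. As a simplification, each new segment here begins at the first frame that fails the test, whereas the quoted passage states that typically the last frame of the previous segment is reused as a base frame. The 60% default mirrors the example threshold in the quoted passage, and treating "at least" the threshold as sufficient is likewise an assumption.

```python
from typing import List, Set


def segment_sequence(frame_features: List[Set[int]],
                     threshold: float = 0.6) -> List[List[int]]:
    """Divide frame indices into segments (hypothetical sketch).

    frame_features[i] holds identifiers of the feature points found in
    frame i. A frame joins the current segment when at least `threshold`
    of the base frame's feature points are also found in that frame.
    """
    segments: List[List[int]] = []
    current: List[int] = []
    base_points: Set[int] = set()

    for index, points in enumerate(frame_features):
        if not current:
            # First frame of a segment is chosen as the base frame.
            current, base_points = [index], points
            continue
        tracked = len(base_points & points)
        if base_points and tracked >= threshold * len(base_points):
            # Enough base-frame feature points tracked: add the frame.
            current.append(index)
        else:
            # Too few tracked points: end this segment, start the next.
            # Assumption: the failing frame becomes the new base frame.
            segments.append(current)
            current, base_points = [index], points

    if current:
        segments.append(current)
    return segments


# Hypothetical data: five frames, each a set of feature-point identifiers.
frames = [
    set(range(1, 11)),               # base frame with 10 feature points
    set(range(1, 9)),                # 8 of 10 tracked
    set(range(1, 8)),                # 7 of 10 tracked
    {1, 2, 3} | set(range(11, 18)),  # only 3 of 10 tracked
    {1} | set(range(11, 18)),        # 8 of 10 tracked from the new base
]
print(segment_sequence(frames))  # -> [[0, 1, 2], [3, 4]]
```

In this sketch the two resulting segments contain three and two frames respectively, which illustrates how segment length follows from the feature points tracked rather than from a fixed frame count.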

The Action relies on Jain and Lee. First of all, as the Action agrees, nothing in 
Jain teaches or suggests "determining whether a threshold number of feature points from 
the base frame are identified in the next frame; if a threshold number of feature points 
are identified in the next frame, adding the next frame to the first segment" as recited in 
Applicants' claim 23. See, Action at Pg. 14, Para 2. Although Lee at Col. 3, Lns. 12-16
recites "comparing on a pixel-by-pixel basis the differential pixel value with a threshold
value TH and selecting one or more regions, each of the selected regions consisting of
... differential pixel values larger than threshold value TH," that does not teach or suggest
"if a threshold number of feature points are identified in the next frame, adding the next
frame to the first segment" as recited in Applicants' claim. Lee's threshold TH is compared
against individual differential pixel values, not against a number of feature points.
Moreover, nothing in Lee teaches or suggests "adding the next frame to the first segment."
All Lee teaches is that a current frame being encoded is compared "on a pixel-by-pixel
basis" to a "reference frame" and based on the comparison "selecting one or more regions
each of the selected regions consisting of ... differential pixel values larger than threshold
value TH." See, Lee, Col. 3, Lns. 2-16. Thus, nothing in Lee teaches or suggests "adding the
next frame to the first segment." In fact, the set of frames being analyzed in Lee is
predetermined according to the signal stream being encoded. See, Lee at Col. 2, Ln. 62 -
Col. 3, Ln. 31. Lee does not teach "adding" frames to this predetermined set of frames,
let alone "adding" based on "if a threshold number of feature points are identified in the
next frame, adding the next frame to the first segment."

Jain too fails to teach or suggest "adding the next frame to the first segment." In
fact, as the Action itself admits, the fixed 30 frames per second is "the standard NTSC
frame rate." See, e.g., Action at Pg. 3 (emphasis added). This fails to teach or suggest
"adding the next frame to the first segment" or changing the segments in any form, let alone
based on "if a threshold number of feature points are identified in the next frame, adding
the next frame to the first segment" as recited in Applicants' claim 23.

Since the applied references do not teach or suggest at least one element of claim
23, claim 23 in its present form should be allowed. 

Dependent claims 24-30 

Claims 24-30 depend on claim 23 and thus, at least for the reasons set forth above 
with respect to claim 23, claims 24-30 should be allowed. 

Dependent claims 17 and 20

Claims 17 and 20 ultimately depend from claim 9 and thus, at least for the reasons 
set forth below with respect to claim 9, claims 17 and 20 should be allowed. Each of 
claims 17 and 20 also recites independently patentable features and thus should be
allowed for that reason. 

The Action relies on Jain and Lee, and rejects claim 17 based on the same 
rationale as the aforementioned claim 23. See, Action at Pgs. 13-15 (grouping claim 17 
rejection with claims 23, 24 and 28). The Applicants disagree, at least for the reason that 
claim 17 recites a number of elements that are not found in claims 23, 24 and 28 and vice
versa. For instance, claim 17 recites as follows: 

The method of claim 9 wherein encoding includes: 

choosing at least two frames in the segment that are at least a threshold number of
frames apart;

for each of the at least two chosen frames, projecting a plurality of three-dimensional
points into a corresponding virtual frame; and

for each of the at least two chosen frames, projecting an uncertainty into the
corresponding virtual frame.

None of these elements of Applicants' claim 17 listed above is recited in claim
23. As a result, Applicants submit that grouping the rejection of claim 17 with claim 23,
and more particularly rejecting claim 17 for the same reasons as claim 23, is improper.

Nevertheless, the applied references, Jain and Lee, both individually and in 
combination, fail to teach or suggest many aspects of claim 17. For instance, Jain and 
Lee fail to teach or suggest "choosing at least two frames in the segment that are at least
a threshold number of frames apart; for each of the at least two chosen frames" as recited
in Applicants' claim 17. If, as the Action alleges, having "one key frame ... manually
selected for every thirty frames" teaches "segments," then all that Jain teaches is that
"one key frame has been manually selected for every" segment. This is not the same as
"choosing at least two frames in the segment," let alone those "that are at least a
threshold number of frames apart" as recited by Applicants' claim 17.

Lee too fails to teach or suggest "choosing at least two frames in the segment that 
are at least a threshold number of frames apart." All that Lee teaches is a motion
estimation technique for encoding video signals "having a plurality of frames including a 
current frame and a reference frame." Nothing in Lee teaches or suggests that the 
"current frame and a reference frame" have to be "at least a threshold number of frames 
apart" as recited by Applicants' claim 17. 

Since the applied references do not teach or suggest at least one element of claim
17, claim 17 in its present form should be allowed. 

Patentability of Claims 1-16, 18, 19, 21, 22, 36, and 37 over Jain under 102(b) 

Claims 1-16, 18, 19, 21, 22, 36, and 37 are rejected under 35 U.S.C. 102(b) as 
being anticipated by Jain. Applicants respectfully traverse the rejection. The claims in 
their present form are allowable over Jain. For a 102(b) rejection to be proper, the
applied art must show each and every element as set forth in a claim. (See, MPEP §
2131.01.) However, Jain fails to do so.

Independent claim 1

Amended claim 1 recites as follows: 

A method of recovering a three-dimensional scene from two-dimensional images, 
the method comprising: 

providing a sequence of frames; 

dividing the sequence of frames into frame segments wherein the frames in the 
sequence comprise feature points and wherein dividing the sequence of frames into frame 
segments is based upon at least a threshold number of feature points being tracked 
between the frames of the frame segments; 

performing three-dimensional reconstruction individually for each frame segment 
derived by dividing the sequence of frames; and 

combining the three-dimensional reconstructed segments together to recover a 
three-dimensional scene for the sequence of images. 

The applied reference Jain fails to teach or suggest many aspects of Applicants' 
claim 1. For instance, setting a fixed 30 frame per second video frame capture rate
according to a "NTSC standard" for a camera as taught by Jain fails to teach or suggest 
"dividing the sequence of frames into frame segments wherein the frames in the sequence 
comprise feature points and wherein dividing the sequence of frames into frame segments 
is based upon at least a threshold number of feature points being tracked between the 
frames of the frame segments."

As noted above with respect to claim 23, the specification at Pg. 9, Ln. 4 - Pg. 10,
Ln. 18 recites an exemplary method of "dividing the sequence of frames into frame
segments ... wherein dividing the sequence of frames into frame segments is based upon
at least a threshold number of feature points being tracked between the frames of the
frame segments." The Action relies on Jain's teaching at Col. 23, Lns. 58-67 as follows:

Ideally the scene analysis process just described should be applied to every video frame 
in order to get the most precise information about (i) the location of players and (ii) the 
events in the scene. However, it would require significant human and computational 
effort to do so in the rudimentary, prototype, MPI video system because feature points are 
located manually, and not by automation. Therefore, one key frame has been manually 
selected for every thirty frames, and scene analysis has been applied to the selected key 
frames. (Emphasis added). 

Thus, what Jain teaches is simply that "one key frame has been manually selected
for every thirty frames." See, id. This does not lead one of ordinary skill in the art to
"dividing the sequence of frames into frame segments." Even assuming it does, nothing
in Jain suggests any criteria-based selection process for determining which
30 frames are in its alleged "segment." In fact, there appears to be no selection at all in
Jain. Instead, all that Jain teaches is always capturing the next 30 frames in time.
By contrast, regarding an exemplary criteria-based "dividing" method, the Applicants'
specification at Pg. 10, Lns. 3-18 states as follows:

As indicated by arrow 102, each frame in an input sequence is taken in order and 
analyzed to determine whether it contains a threshold number of feature points tracked 
from the base frame or frames. If so, the frame is added to the current segment. . . . 
Again, feature points are tracked with respect to the new base frame in the next segment. 
Because the number of frames depends on the feature points tracked between the frames, 
the segments can vary in length. 

Thus, in one significant difference with Jain, applying the method of Applicants'
claim 1 can result in "segments" that vary in the number of frames therein, whereas all
that Jain teaches is capturing "30 frames per second." Thus, Jain fails to teach or suggest
"wherein dividing the sequence of frames into frame segments is based upon at least a
threshold number of feature points being tracked between the frames of the frame
segments" as recited by Applicants' claim 1.

Although the Action does not rely on Lee for this rejection, Applicants submit
that it too fails to teach or suggest "wherein dividing the sequence of frames into frame
segments is based upon at least a threshold number of feature points being tracked
between the frames of the frame segments." All Lee teaches is that "a current frame"
being encoded is compared "on a pixel-by-pixel basis" to a "reference frame" and based
on the comparison "selecting one or more regions each of the selected regions consisting
of ... differential pixel values larger than threshold value TH." See, Lee, Col. 3, Lns. 2-
16. This does not teach or suggest "dividing the sequence of frames into frame
segments" at least because Lee is silent as to "segments": the frames in Lee are from
a "video signal" stream, and nothing in Lee teaches or suggests that this sequence be
altered, let alone by "dividing the sequence of frames into frame segments."

Thus, by combining the teachings of Jain with Lee one would arrive at Jain's fixed
"standard" 30 frames and Lee's method of motion estimation based on comparing "a
current frame and a reference frame." This fails to teach or suggest "dividing the
sequence of frames into frame segments wherein the frames in the sequence comprise
feature points and wherein dividing the sequence of frames into frame segments is based
upon at least a threshold number of feature points being tracked between the frames of
the frame segments."

Since the cited references do not teach or suggest at least one element of claim 1,
claim 1 in its present form should be allowed. 

Dependent claims 2 and 4-7 

Claims 2 and 4-7 ultimately depend on claim 1 and thus, at least for the reasons
set forth above with respect to claim 1, claims 2 and 4-7 should also be allowed.
Furthermore, each of claims 2 and 4-7 also recites independently patentable features
and thus should be allowed for that reason.

For instance, the applied references fail to teach or suggest "wherein performing
includes creating at least two virtual key frames for each of the segments, wherein the
virtual key frames are only a subset of the images in a segment but are a representation
of all of the images in that segment" as recited by Applicants' claim 2. The Applicants'
specification describes an exemplary method for creating "virtual key frames" as follows
at Pg. 19, Lns. 4-18:

Two virtual frames are used to represent each segment because at least two frames are 
desired for 3D reconstruction. The position of a virtual frame can coincide with one real 
frame, say k. Alternatively, the virtual frame does not have to coincide with a real frame. 
A virtual frame contains the projection of the 3D reconstructed points, denoted by
$\mathbf{u}_{ik} = [u_{ik}, v_{ik}]$, and its covariance matrix $\Lambda_{\mathbf{u}_{ik}}$. (Emphasis added.)

The Action relies on Jain and notes that "Jain discloses the use of virtual key
frames (col. 23, ln. 58 to col. 24, ln. 3)." See, Action Pg. 4. However, as the Action itself
notes, what Jain teaches is selecting "one key frame from every 30 frames, i.e. segment of
sequence of frames." Thus, "one key frame from every 30 frames" does not teach or
suggest "at least two virtual key frames for each of the segments," let alone "virtual key
frames" that "are a representation of all of the images in that segment."

Since the cited references do not teach or suggest at least one element of claim 2,
claim 2 in its present form should be allowed. 

Independent claim 9 

Claim 9 recites as follows: 

A method of recovering a three-dimensional scene from two-dimensional images, 
the method comprising: 

identifying a sequence of two-dimensional frames that include two- 
dimensional images; 

dividing the sequence of frames into segments, wherein a segment includes 
a plurality of frames; 

for each segment, encoding the frames in the segment into at least two 
virtual frames that include a three-dimensional structure for the segment and an 
uncertainty associated with the segment. 

The applied reference Jain fails to teach or suggest many aspects of Applicants' 
claim 9. For instance, Jain fails to teach or suggest "a method of recovering a three-
dimensional scene from two-dimensional images, the method comprising ... dividing the
sequence of frames into segments, ... for each segment, encoding the frames in the
segment into at least two virtual frames that include a three-dimensional structure for the
segment and an uncertainty associated with the segment" as recited in Applicants' claim
9. By contrast, the Applicants' specification at Pg. 19, Lns. 4-18 describes an exemplary
method of "encoding the frames in the segment into at least two virtual frames" as
follows:

Two virtual frames are used to represent each segment because at least two frames are 
desired for 3D reconstruction. The position of a virtual frame can coincide with one real 
frame, say k. Alternatively, the virtual frame does not have to coincide with a real frame.
A virtual frame contains the projection of the 3D reconstructed points, denoted by
$\mathbf{u}_{ik} = [u_{ik}, v_{ik}]$, and its covariance matrix $\Lambda_{\mathbf{u}_{ik}}$. (Emphasis added.)

The Action relies on Jain and states "Jain discloses the extraction of key frames by
selecting one key frame from every 30 frames." See, Action Pg. 7. First of all, Jain
teaches selecting "one key frame ... for every thirty frames," which is not the same as "for
each segment, encoding the frames in the segment into at least two virtual frames that
include a three-dimensional structure for the segment and an uncertainty associated with
the segment."

Furthermore, what Jain teaches is selecting "one key frame ... for every thirty
frames, and scene analysis has been applied to the selected key frames" wherein "scene
analysis" comprises "Extracting Three-dimensional Information." This is not the same as
"encoding the frames in the segment into at least two virtual frames that include a three-
dimensional structure for the segment and an uncertainty associated with the segment"
as recited in Applicants' claim 9. More particularly, "extracting" "three-dimensional
information" from a single "key frame" as taught by Jain is not the same as "encoding ...
into at least two virtual frames" that include "a three-dimensional structure for the segment."

Thus, because Jain fails to teach or suggest at least one element of claim 9,
claim 9 in its present form should be allowed.

Dependent claims 10-22 

Claims 10-22 ultimately depend on claim 9 and thus, at least for the reasons set
forth above with respect to claim 9, claims 10-22 should also be allowed.

Independent claim 36 

Claim 36 recites as follows: 

A computer-readable medium having computer-executable instructions for 
performing a method comprising: 

providing a sequence of two-dimensional frames; 

dividing the sequence into segments;

calculating a partial model for each segment that includes three- 
dimensional coordinates and camera pose for features within the frames; 

extracting virtual key frames from each partial model, the virtual key 
frames having three-dimensional coordinates for the frames and an uncertainty 
associated with the frames; and 

bundle adjusting the virtual key frames to obtain a complete three- 
dimensional reconstruction of the two-dimensional frames. 

The applied reference Jain fails to teach or suggest many aspects of Applicants'
claim 36. For instance, Jain fails to teach or suggest "calculating a partial model for
each segment that includes three-dimensional coordinates and camera pose for features
within the frames; extracting virtual key frames from each partial model." By contrast, the
Applicants' specification at Pg. 8, Ln. 7 to Pg. 9, Ln. 2 describes an exemplary method
for "calculating a partial model for each segment" and "extracting virtual key frames
from each partial model" as follows:

For each segment, a partial model is created. For example, for segment 1, a partial model 
70 is shown whereas for segment N, a partial model 72 is shown. The partial model 
contains the same number of frames as the segment it represents. Thus, continuing the
example above, the partial model 70 contains 100 frames and the partial model 72
contains 110 frames. The partial models provide three-dimensional coordinates and
camera pose for the feature points. The partial model may also contain an uncertainty
associated with the segment.

From each partial model, at least two virtual key frames are generated. For
example, partial model 70 is represented by two virtual key frames 74 and partial model 
72 is represented by two virtual key frames 76. Additional virtual key frames (e.g., 3, 4, 
5, etc.) may be used for each segment, but for ease of illustration only two virtual key 
frames are used. The virtual key frames are representative 3D frames for each segment. 
The virtual key frames essentially encode the 3-D structure for each segment along with 
this uncertainty. However, because the virtual key frames are a small subset of a total 
number of frames in each segment, an efficient bundle adjustment can be performed for 
all the segments to obtain the final 3-D reconstruction 78. (Emphasis added). 
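
For illustration only, the reduction described in the quoted passage can be worked through with its own example figures (partial models of 100 and 110 frames, two virtual key frames each). The following Python sketch is hypothetical: the function name frames_for_bundle_adjustment and its parameters are assumptions introduced here, and the sketch only counts frames; it does not perform any reconstruction or bundle adjustment.

```python
from typing import List


def frames_for_bundle_adjustment(segment_lengths: List[int],
                                 key_frames_per_segment: int = 2) -> int:
    """Count the virtual key frames passed to the final bundle adjustment.

    Assumption: every segment is represented by the same number of
    virtual key frames, and at least two are used per segment.
    """
    if key_frames_per_segment < 2:
        raise ValueError("at least two virtual key frames per segment")
    return len(segment_lengths) * key_frames_per_segment


# Example figures taken from the quoted passage.
segment_lengths = [100, 110]
frames_in_partial_models = sum(segment_lengths)
virtual_key_frames = frames_for_bundle_adjustment(segment_lengths)
print(frames_in_partial_models, virtual_key_frames)  # -> 210 4
```

Under these assumptions the final bundle adjustment operates on 4 virtual key frames rather than on the 210 frames contained in the two partial models.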

Thus, as noted above, the "partial model" and the "virtual key frames" represent exemplary
abstractions of the actual frames in their respective segments. The Action relies on Jain's
teaching at Col. 23, Lns. 58-67 as follows:

Ideally the scene analysis process just described should be applied to every video frame 
in order to get the most precise information about (i) the location of players and (ii) the 
events in the scene. However, it would require significant human and computational
effort to do so in the rudimentary, prototype, MPI video system because feature points are 
located manually, and not by automation. Therefore, one key frame has been manually 
selected for every thirty frames, and scene analysis has been applied to the selected key 
frames. (Emphasis added). 

Thus, the "key frame" as taught by Jain is "manually selected for every thirty
frames" wherein the 30 frames are from a raw feed of the camera. This is not the same as
"calculating a partial model for each segment" and "extracting virtual key frames from
each partial model." This is so at least because Jain fails to teach or suggest creating "a
partial model for each segment." If, as the Action alleges, Jain's "30 frames is a segment,"
then Jain still fails to teach "calculating a partial model for each segment." More
particularly, since Jain teaches selecting "one key frame ... for every thirty frames," it
fails to teach or suggest "extracting virtual key frames from each partial model." This is
so at least because selecting "one key frame ... for every thirty frames" (i.e., directly
from the alleged "segment") is not the same as "extracting virtual key frames from each
partial model" wherein "a partial model" is calculated "for each segment" as recited in
claim 36.

Thus, because Jain fails to teach or suggest at least one element of claim 36,
claim 36 in its present form should be allowed.

Independent claim 37 

Claim 37 recites as follows: 

An apparatus for recovering a three-dimensional scene from a sequence of 
two-dimensional frames by segmenting the frames, comprising: 
means for capturing two-dimensional images; 
means for dividing the sequence into segments; 

means for calculating a partial model for each segment that includes three- 
dimensional coordinates and camera pose for features within the frames; 

means for extracting virtual key frames from each partial model; and 
means for bundle adjusting the virtual key frames to obtain a complete 
three-dimensional reconstruction of the two-dimensional frames. 

The applied reference Jain fails to teach or suggest many aspects of Applicants' claim 37.
For instance, at least for the reasons listed above with respect to claim 36, Jain fails to
teach or suggest "calculating a partial model for each segment that includes three-
dimensional coordinates and camera pose for features within the frames; ... extracting
virtual key frames from each partial model."

Thus, because Jain fails to teach or suggest at least one element of claim 37,
claim 37 in its present form should be allowed.

Patentability of Claims 31-35 over Jain under 103(a) 

Claims 31-35 are rejected under 35 U.S.C. 103(a) as being unpatentable over Jain. 
Applicants respectfully submit that the claims in their present form are allowable over the 
applied art. To establish a prima facie case of obviousness, three basic criteria must be 
met. First, there must be some suggestion or motivation, either in the references 
themselves or in the knowledge generally available to one of ordinary skill in the art, to 
modify the reference or to combine reference teachings. Second, there must be a 
reasonable expectation of success. Finally, the prior art reference (or references when 
combined) must teach or suggest all the claim limitations. (MPEP § 2142).

Independent claim 31 

Claim 31 recites as follows: 

In a method of recovering a three-dimensional scene from a sequence of two- 
dimensional frames, an improvement comprising dividing a long sequence of 
frames into segments and reducing the number of frames in each segment by 
representing the segments using between two and five representative frames per 
segment, wherein the representative frames are used to recover the three- 
dimensional scene and remaining frames are discarded. 

The applied reference Jain fails to teach or suggest many aspects of Applicants'
claim 31. For instance, Jain fails to teach or suggest "dividing a long sequence of frames
into segments and reducing the number of frames in each segment by representing the
segments using between two and five representative frames per segment." The Action
relies on Jain, which teaches selecting "one key frame from every 30 frames." The Action
further states that "Jain does not specifically disclose the reducing the number of frames
in each segment by representing the segments using between two and five representative
frames per second. However, Jain discloses the manual adjustment of the number of key
frames" and points to Col. 23, Ln. 64 to Col. 24, Ln. 3 of Jain. Applicants disagree. What
Jain states is that "one key frame" can be "manually selected." However, "manual
selection" is not the same as "manual adjustment" as the Action contends,
and more particularly "manual selection" of "one key frame" does not lead one to "using
between two and five representative frames per segment" as recited in Applicants' claim
31.

Thus, because Jain fails to teach or suggest at least one element of claim 31,
claim 31 in its present form should be allowed.

Dependent claims 32-35 

Claims 32-35 ultimately depend on claim 31 and thus, at least for the reasons set
forth above with respect to claim 31, claims 32-35 should also be allowed.

Conclusion



The claims in their present form should now be allowable. Such action is 
respectfully requested. 